NSF PAR Search | NSF Public Access Repository


Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models

https://doi.org/10.1016/j.cmpb.2025.109063

Calle, Paul; Bates, Averi; Reynolds, Justin C; Liu, Yunlong; Cui, Haoyang; Ly, Sinaro; Wang, Chen; Zhang, Qinghao; de Armendi, Alberto J; Shettar, Shashank S; et al (December 2025, Computer Methods and Programs in Biomedicine)

Background and Objectives: The variability and biases in the real-world performance benchmarking of deep learning models for medical imaging compromise their trustworthiness for real-world deployment. The common approach of holding out a single fixed test set fails to quantify the variance in the estimation of test performance metrics. This study introduces NACHOS (Nested and Automated Cross-validation and Hyperparameter Optimization using Supercomputing) to reduce and quantify the variance of test performance metrics of deep learning models. Methods: NACHOS integrates Nested Cross-Validation (NCV) and Automated Hyperparameter Optimization (AHPO) within a parallelized high-performance computing (HPC) framework. NACHOS was demonstrated on a chest X-ray repository and an Optical Coherence Tomography (OCT) dataset under multiple data partitioning schemes. Beyond performance estimation, DACHOS (Deployment with Automated Cross-validation and Hyperparameter Optimization using Supercomputing) is introduced to leverage AHPO and cross-validation to build the final model on the full dataset, improving expected deployment performance. Results: The findings underscore the importance of NCV in quantifying and reducing estimation variance, AHPO in optimizing hyperparameters consistently across test folds, and HPC in ensuring computational feasibility. Conclusions: By integrating these methodologies, NACHOS and DACHOS provide a scalable, reproducible, and trustworthy framework for DL model evaluation and deployment in medical imaging. To maximize public availability, the full open-source codebase is provided at https://github.com/thepanlab/NACHOS.
Free, publicly-accessible full text available December 1, 2026
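The nested cross-validation idea in the NACHOS abstract above can be illustrated with a minimal sketch. This is not the NACHOS codebase: the toy one-dimensional threshold classifier, the synthetic data, the `shift` hyperparameter, the plain grid search standing in for automated hyperparameter optimization, and all function names are assumptions made for illustration only. The outer loop holds out each fold in turn as a test set, the inner loop selects a hyperparameter using only the remaining data, and the spread of the outer-fold scores gives a variance estimate that a single fixed test set cannot.

```python
import random
import statistics


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def take(values, indices):
    return [values[i] for i in indices]


def fit(xs, ys, shift):
    """Toy model (assumption): threshold at the midpoint of class means plus a shift."""
    mean0 = statistics.fmean([x for x, y in zip(xs, ys) if y == 0])
    mean1 = statistics.fmean([x for x, y in zip(xs, ys) if y == 1])
    return (mean0 + mean1) / 2 + shift


def accuracy(threshold, xs, ys):
    hits = sum((x > threshold) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(xs)


def nested_cv(xs, ys, grid, outer_k=5, inner_k=4):
    """Return one test score and one selected hyperparameter per outer fold."""
    test_scores, chosen = [], []
    outer_folds = k_fold_indices(len(xs), outer_k)
    for t, test_idx in enumerate(outer_folds):
        # Everything outside the test fold is available for model development.
        dev_idx = [i for f, fold in enumerate(outer_folds) if f != t for i in fold]
        dev_x, dev_y = take(xs, dev_idx), take(ys, dev_idx)
        inner_folds = k_fold_indices(len(dev_x), inner_k)

        def inner_score(shift):
            # Mean validation accuracy over the inner folds for one candidate.
            scores = []
            for v, val_idx in enumerate(inner_folds):
                train_idx = [i for f, fold in enumerate(inner_folds)
                             if f != v for i in fold]
                thr = fit(take(dev_x, train_idx), take(dev_y, train_idx), shift)
                scores.append(accuracy(thr, take(dev_x, val_idx), take(dev_y, val_idx)))
            return statistics.fmean(scores)

        # Grid search stands in for automated hyperparameter optimization.
        best = max(grid, key=inner_score)
        # Retrain on the full development set, then score once on the test fold.
        thr = fit(dev_x, dev_y, best)
        test_scores.append(accuracy(thr, take(xs, test_idx), take(ys, test_idx)))
        chosen.append(best)
    return test_scores, chosen


def make_toy_data(n=60, seed=0):
    """Synthetic data (assumption): alternating labels, Gaussian features."""
    rng = random.Random(seed)
    ys = [i % 2 for i in range(n)]
    xs = [rng.gauss(2.0 * y, 1.0) for y in ys]
    return xs, ys


xs, ys = make_toy_data()
scores, chosen = nested_cv(xs, ys, grid=[-0.5, 0.0, 0.5])
print("per-fold test accuracy:", scores)
print("mean:", statistics.fmean(scores), "stdev:", statistics.stdev(scores))
print("selected shift per fold:", chosen)
```

Each outer fold is processed independently of the others, which is the property that lets a framework distribute the folds across parallel workers; this sketch runs them sequentially.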
Deep Learning for Autonomous Surgical Guidance Using 3‐Dimensional Images From Forward‐Viewing Endoscopic Optical Coherence Tomography

https://doi.org/10.1002/jbio.202500181

Ly, Sinaro; Badré, Adrien; Brandt, Parker; Wang, Chen; Calle, Paul; Reynolds, Justin; Zhang, Qinghao; Fung, Kar‐Ming; Cui, Haoyang; Yu, Zhongxin; et al (July 2025, Journal of Biophotonics)

Abstract: A three‐dimensional convolutional neural network (3D‐CNN) was developed for the analysis of volumetric optical coherence tomography (OCT) images to enhance endoscopic guidance during percutaneous nephrostomy. The model was performance‐benchmarked using a 10‐fold nested cross‐validation procedure and achieved an average test accuracy of 90.57% across a dataset of 10 porcine kidneys. This performance significantly exceeded that of 2D‐CNN models, which attained average test accuracies ranging from 85.63% to 88.22% using 1, 10, or 100 radial sections extracted from the 3D OCT volumes. The 3D‐CNN (~12 million parameters) was benchmarked against three state‐of‐the‐art volumetric architectures: the 3D Vision Transformer (3D‐ViT, ~45 million parameters), 3D‐DenseNet121 (~12 million parameters), and the Multi‐plane and Multi‐slice Transformer (M3T, ~29 million parameters). While these models achieved comparable inference accuracy, the 3D‐CNN exhibited lower inference latency (33 ms) than 3D‐ViT (86 ms), 3D‐DenseNet121 (58 ms), and M3T (93 ms), representing a critical advantage for real‐time surgical guidance applications. These results demonstrate the 3D‐CNN's capability as a powerful and practical tool for computer‐aided diagnosis in OCT‐guided surgical interventions.
Free, publicly-accessible full text available July 25, 2026
Intelligent platform for needle-based intervention (NBI) guidance (Conference Presentation)

https://doi.org/10.1117/12.3051440

Tang, Qinggong; Liu, Yunlong; Calle, Paul; Zhang, Qinghao; Selvaraj Mercyshalinie, Ebenezer Raj; Fung, Kar-Ming; Pan, Chongle (March 2025, SPIE)
Boudoux, Caroline; Tunnell, James W (Ed.)
Free, publicly-accessible full text available March 20, 2026
Integration of nested cross-validation, automated hyperparameter optimization, high-performance computing to reduce and quantify the variance of test performance estimation of deep learning models

Calle, Paul; Bates, Averi; Reynolds, Justin; Liu, Yunlong; Cui, Haoyang; Ly, Sinaro; Wang, Chen; Zhang, Qinghao; Armendi, Alberto; Shettar, Shashank; et al (March 2025, arXiv.org)

The variability and biases in the real-world performance benchmarking of deep learning models for medical imaging compromise their trustworthiness for real-world deployment. The common approach of holding out a single fixed test set fails to quantify the variance in the estimation of test performance metrics. This study introduces NACHOS (Nested and Automated Cross-validation and Hyperparameter Optimization using Supercomputing) to reduce and quantify the variance of test performance metrics of deep learning models. NACHOS integrates Nested Cross-Validation (NCV) and Automated Hyperparameter Optimization (AHPO) within a parallelized high-performance computing (HPC) framework. NACHOS was demonstrated on a chest X-ray repository and an Optical Coherence Tomography (OCT) dataset under multiple data partitioning schemes. Beyond performance estimation, DACHOS (Deployment with Automated Cross-validation and Hyperparameter Optimization using Supercomputing) is introduced to leverage AHPO and cross-validation to build the final model on the full dataset, improving expected deployment performance. The findings underscore the importance of NCV in quantifying and reducing estimation variance, AHPO in optimizing hyperparameters consistently across test folds, and HPC in ensuring computational feasibility. By integrating these methodologies, NACHOS and DACHOS provide a scalable, reproducible, and trustworthy framework for DL model evaluation and deployment in medical imaging.
Free, publicly-accessible full text available March 11, 2026
Deep learning aided epidural needle guidance using a forward-view polarization-sensitive optical coherence tomography probe

https://doi.org/10.1117/12.3034643

Wang, Chen; Liu, Yunlong; Calle, Paul; Zhang, Qinghao; Yan, Feng; Fung, Kar-Ming; Chen, Sixia; Pan, Chongle; Tang, Qinggong (October 2024, SPIE)
Applegate, Brian E; Tkaczyk, Tomasz S (Ed.)
Full Text Available
AI-Enabled Personalized Smoking Cessation Intervention With the Aipaca Chatbot: Mixed Methods Feasibility Study

https://doi.org/10.2196/73319

Liu, Yunlong; Calle, Paul; Vadakekut, Mariah; Rubin, Daniel; Nagykaldi, Zsolt; Doescher, Mark; Hightow-Weidman, Lisa; Pan, Chongle; Shao, Ruosi (January 2025, JMIR Formative Research)

Background: Tobacco use remains the leading cause of preventable mortality in the United States; yet, evidence-based cessation services remain underused due to staffing constraints, limited access to counseling, and competing clinical priorities. Generative artificial intelligence (GenAI) chatbots may address these barriers by delivering personalized, guideline-aligned counseling through naturalistic dialogue. However, little is known about how GenAI chatbots support smoking cessation at both outcome and communication process levels. Objective: This feasibility study evaluated the implementation of an evidence-based smoking cessation counseling session delivered by a GenAI-powered chatbot, Aipaca. We examined (1) pre-post changes in cessation preparedness, (2) communication dynamics during counseling sessions, and (3) user perceptions of the chatbot’s value, limitations, and design needs. Methods: We conducted an observational, single-arm, mixed methods study with 29 adult smokers. Participants completed pre-post surveys measuring knowledge of smoking-related health risks and cessation methods, self-efficacy, and readiness to quit. Each engaged in a 30-minute text-based counseling session with Aipaca, powered by GPT-4 and structured using the 5A’s framework (Ask, Advise, Assess, Assist, Arrange). Sessions were transcribed for microsequential conversation analysis. Twenty-five participants completed semistructured interviews exploring perceived value, challenges, and design suggestions. Quantitative data were analyzed with paired-samples t tests, qualitative data were thematically analyzed, and transcripts were analyzed for interactional practices. The methodological strength of this study lies in its triangulated approach, which combines quantitative measurement of intervention effectiveness, qualitative analysis of user interviews, and conversational analysis of counseling transcripts to generate a comprehensive understanding of both outcomes and underlying mechanisms.
Results: Participants demonstrated significant improvements in all preparedness indicators: knowledge of health risks, knowledge of cessation methods, self-efficacy, and readiness to quit. Conversation analysis identified three recurrent patterns enabling counseling-relevant dynamics: (1) contextual referencing and continuity, (2) formulations with elaboration prompts, and (3) narrative progression toward collaborative planning. Interview themes underscored Aipaca’s perceived value as an accessible, nonjudgmental, and motivating resource, capable of delivering personalized and interactive support. Criticisms included limited accountability, reduced cultural resonance, and overly goal-directed style. Participants emphasized design needs such as proactive engagement, gamified progress tracking, empathetic or anthropomorphic personas, and safeguards for accuracy. Conclusions: This mixed methods feasibility study demonstrates that GenAI can deliver evidence-based smoking cessation counseling with measurable short-term gains in cessation preparedness and process-level communication patterns consistent with motivational interviewing. Users valued Aipaca’s accessibility, empathy, and personalization, while also articulating expectations for richer social roles and long-term accountability. Findings highlight both the promise and challenges of integrating GenAI into digital health: pairing adaptive language generation with human-centered design, embedding accuracy safeguards, and ensuring integration into multilevel cessation infrastructures will be essential for future clinical deployment.
Full Text Available
Epidural needle guidance using a forward-view polarization-sensitive optical coherence tomography probe

https://doi.org/10.1117/12.3005243

Wang, Chen; Liu, Yunlong; Calle, Paul; Zhang, Qinghao; Yan, Feng; Selvaraj Mercyshalinie, Ebenezer Raj; Fung, Kar-Ming; Chen, Sixia; Pan, Chongle; Tang, Qinggong (March 2024, SPIE)
Izatt, Joseph A.; Fujimoto, James G. (Ed.)
Enhancing epidural needle guidance using a polarization‐sensitive optical coherence tomography probe with convolutional neural networks

https://doi.org/10.1002/jbio.202300330

Wang, Chen; Liu, Yunlong; Calle, Paul; Li, Xinwei; Liu, Ronghao; Zhang, Qinghao; Yan, Feng; Fung, Kar‐ming; Conner, Andrew K.; Chen, Sixia; et al (October 2023, Journal of Biophotonics)

Abstract: Epidural anesthesia helps manage pain during different surgeries. Nonetheless, the precise placement of the epidural needle remains a challenge. In this study, we developed a probe based on polarization‐sensitive optical coherence tomography (PS‐OCT) to enhance epidural anesthesia needle placement. The probe was tested on six porcine spinal samples. The multimodal imaging guidance used the OCT intensity mode and three distinct PS‐OCT modes: (1) phase retardation, (2) optic axis, and (3) degree of polarization uniformity (DOPU). Each mode enabled the classification of different epidural tissues through distinct imaging characteristics. To further streamline the tissue recognition procedure, convolutional neural networks (CNNs) were used to autonomously identify the tissue types within the probe's field of view. ResNet50 models were developed for all four imaging modes. DOPU imaging was found to provide the highest cross‐testing accuracy of 91.53%. These results show that PS‐OCT improves precision in guiding epidural anesthesia needle placement.
Full Text Available
Feasibility of using polarization-sensitive optical coherence tomography (PS-OCT) in epidural anesthesia guidance (Conference Presentation)

https://doi.org/10.1117/12.2650474

Wang, Chen; Liu, Yunlong; Calle, Paul; Yan, Feng; de Armendi, Alberto J.; Shettar, Shashank S.; Fung, Kar-Ming; Pan, Chongle; Tang, Qinggong (March 2023, Proc. SPIE PC12368, Advanced Biomedical and Clinical Diagnostic and Surgical Guidance Systems XXI, PC1236802)
Boudoux, Caroline; Tunnell, James W. (Ed.)
Full Text Available
Octascope: A Lightweight Pre-Trained Model for Optical Coherence Tomography

https://doi.org/10.1109/ACCESS.2025.3595838

Cui, Haoyang; Wang, Chen; Calle, Paul; Liu, Yunlong; Zhang, Qinghao; Ly, Sinaro; Reynolds, Justin; Yan, Feng; Zhang, Ke; Liu, Ronghao; et al (August 2025, IEEE Access)

Optical coherence tomography (OCT) imaging enables high-resolution visualization of sub-surface tissue microstructures. However, OCT image analysis using deep learning is hampered by a shortage of diverse training data to meet performance requirements and by high inference latency for real-time applications. To address these challenges, we developed Octascope, a lightweight, domain-specific, convolutional neural network (CNN)-based model designed for OCT image analysis. Octascope was pre-trained using a curriculum learning approach, which involves sequential training, first on natural images (ImageNet), then on OCT images from retinal, abdominal, and renal tissues, to progressively acquire transferable knowledge. This multi-domain pre-training enables Octascope to generalize across varied tissue types. In two downstream tasks, Octascope demonstrated notable improvements in predictive accuracy compared to alternative approaches. In the epidural tissue detection task, our method surpassed single-task learning with fine-tuning by 9.13% and OCT-specific transfer learning by 5.95% in accuracy. In a retinal diagnosis task, Octascope outperformed VGG16 and ResNet50 by 5.36% and 6.66%, respectively. In comparison to RETFound, a Transformer-based OCT foundation model, Octascope delivered 2 to 4.4 times faster inference speed with slightly better predictive accuracies in both downstream tasks. Octascope represents a significant advancement for OCT image analysis by providing an effective balance between computational efficiency and diagnostic accuracy for real-time clinical applications.
Free, publicly-accessible full text available August 5, 2026
